@jeffersoncasimir
Contributor

@jeffersoncasimir jeffersoncasimir commented Oct 20, 2025

Closes #9944.
Adds project sizes to the dashboard.

File sizes are read from a cached value in a new DB table for cached data.
The cache is updated by cron or webhook via a new script, update_projects_disk_space.php.
This script recursively calculates the total file size of a dataset using the PHP filesize() function. It ignores only .tgz files, which are not BIDS-recognized and are used for LORIS purposes.
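
For reviewers, the traversal amounts to roughly the following. This is a minimal sketch of the logic in JavaScript (Node.js), not the PHP in update_projects_disk_space.php. The names `directorySize` and `bytesToGB` are hypothetical, skipping symbolic links is a choice made for this sketch, and the conversion of 1 GB = 1024^3 bytes is an assumption about how a value for the "Size (GB)" chart could be derived.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: sums the sizes (in bytes) of all regular files under
// `root`, descending into subdirectories and skipping any file whose name
// ends with one of `ignoredExtensions`. Symbolic links are neither followed
// nor counted (an assumption made for this sketch).
function directorySize(root, ignoredExtensions = ['.tgz']) {
  let total = 0;
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const fullPath = path.join(root, entry.name);
    if (entry.isDirectory()) {
      // Recurse into the subdirectory and add its total.
      total += directorySize(fullPath, ignoredExtensions);
    } else if (entry.isFile()) {
      const ignored = ignoredExtensions.some((ext) => entry.name.endsWith(ext));
      if (!ignored) {
        total += fs.statSync(fullPath).size;
      }
    }
  }
  return total;
}

// Hypothetical helper: converts bytes to gigabytes, assuming 1 GB = 1024^3 bytes.
function bytesToGB(bytes) {
  return bytes / 1024 ** 3;
}
```

Calling `bytesToGB(directorySize('/path/to/project'))` would give one project's value. Caching that number, rather than walking the tree on each dashboard load, is what the new table and the cron/webhook script are for.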

Screenshots:
Screenshot 2025-10-20 at 5 58 23 PM
Screenshot 2025-10-20 at 5 58 26 PM
Screenshot 2025-10-20 at 5 58 42 PM

@jeffersoncasimir jeffersoncasimir added Module: statistics PR or issue related to statistics module Module: dashboard PR or issue related to dashboard module labels Oct 20, 2025
@github-actions github-actions bot added Language: SQL PR or issue that update SQL code RaisinBread PR or issue introducing/requiring improvements to the RaisinBread dataset Language: PHP PR or issue that update PHP code Language: Javascript PR or issue that update Javascript code Module: dqt PR or issue related to (old) dqt module Module: behavioural_qc PR or issue related behavioural_qc module Module: candidate_list PR or issue related to candidate_list module Module: imaging_browser PR or issue related to imaging_browser module labels Oct 20, 2025
@ridz1208
Collaborator

@jeffersoncasimir can you add a description here, please? Mostly covering the design, what is being counted in the file size, and how to set it up. I glanced quickly at the code and I see several SQL additions and a tool script, which tells me you're caching the sizes rather than calculating them on the fly, presumably for speed? If a project has DICOMs, NIfTIs and MINCs, does it calculate the sum of all of those even though they are technically the same scans?

@jeffersoncasimir
Contributor Author

jeffersoncasimir commented Oct 21, 2025

@ridz1208 I will add more info above. Ultimately, the PHP filesize() function is used in the script, which is intended to be run via cron or webhook.

@CamilleBeau CamilleBeau added the State: Needs rebase PR that needs to be rebased to proceed (conflicts, wrong branch...) label Oct 27, 2025
@jeffersoncasimir jeffersoncasimir force-pushed the 2025_09_11_project_size_chart branch from ef15ea3 to f85a0b1 Compare October 27, 2025 21:02
Collaborator

@driusan driusan left a comment

This widget has been translated and we need to be sure there aren't regressions on the dashboard.

Please rebase the PR. Some of the comments and fixes should then make more sense and be easier to understand.

@jeffersoncasimir jeffersoncasimir force-pushed the 2025_09_11_project_size_chart branch from 263a893 to 15842fa Compare October 29, 2025 16:42
@jeffersoncasimir
Contributor Author

@driusan I moved the file from dqt to statistics (dqt was an accidental module choice), and I took a pass at making every string I encountered translatable.

@jeffersoncasimir jeffersoncasimir added the State: Blocking PR should be prioritized because it is blocking the progress of another task label Oct 31, 2025
@driusan driusan self-assigned this Nov 10, 2025
@christinerogers
Contributor

Per the EEG meeting on Oct. 31:
This PR is blocking draft PR #10093, which is only in draft form while it waits for this one to be merged.
Ideally both could go in the release; there is no reason why not.

@jeffersoncasimir who might be able to review this quickly for merge? Saagar, maybe?

filters: '',
chartType: 'pie',
dataType: 'pie',
label: t('Size (GB)', {ns: 'statistics'}),
Contributor

I don't see this in the .po file or the .pot file.

'project_sizes': {
'size_byproject': {
sizing: 11,
title: t('Dataset size breakdown by project', {ns: 'statistics'}),
Contributor

This is also missing from the locale files.

Comment on lines +245 to +247
$values['eeg_data'] = [
'total_recordings' => $eeg_data
];
Contributor

Is this used anywhere?

Contributor

@skarya22 skarya22 left a comment

See the comments above regarding translation. The rest looks good; I was able to test with a few different project sizes.

Image

Development

Successfully merging this pull request may close these issues.

[dashboard] Add plot for file format and project sizes

6 participants